An efficient quantum search engine on unsorted database 
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We consider the problem of finding one or more desired items out of an unsorted database. Patel 
has shown that if the database permits quantum queries, then mere digitization is sufficient for 
efflcient search for one desired item. The algorithm, called factorized quantum search algorithm, 
presented by him can locate the desired item in an unsorted database using OilogiN) queries to 
factorized oracles. But the algorithm requires that all the property values must be distinct from 
each other. In this paper, we discuss how to make a database satisfy the requirements, and present 
a quantum search engine based on the algorithm. Our goal is achieved by introducing auxiliary files 
for the property values that are not distinct, and converting every complex query request into a 
sequence of calls to factorized quantum search algorithm. The query complexity of our algorithm is 
0{P *Q* M * log4,N), where P is the number of the potential simple query requests in the complex 
query request, Q is the maximum number of calls to the factorized quantum search algorithm of 
the simple queries, M is the number of the auxiliary files for the property on which our algorithm 
are searching for desired items. This implies that to manage an unsorted database on an actual 
quantum computer is possible and efficient. 



I. INTRODUCTION 

Consider a collection of items in a database, charac- 
terized by some property with values on the real line. 
Sorting these items means arranging them in an ordered 
sequence according to their property values. The pur- 
pose of sorting is to facilitate subsequent searches of that 
database for items with known values of the property [i( . 

The best classical algorithm uses log2N queries to find 
a desired item out of an ordered list of N items, while 
a quantum computer can solve the problem using a con- 
stant factor fewer queries. The best known lower bound 
shows that any quantum algorithm for the problem re- 
quires at least {InN — 1)/tt « 0.221/052^ queries 0. 
The algorithm obtained by Farhi, Goldstone, Gutmann, 
and Sipser, uses 3\log52N'\ queries, showing that a con- 
stant factor speedup is indeed possible [3:]. The best 
published exact quantum algorithm for the problem uses 
ilogeo^N w 0.433/052^ queries H]. 

In this paper, we consider the problem of finding one 
or more desired items out an unsorted database. Patel 
has shown that if the database quantum queries, then 
mere digitization is sufficient for efficient search for one 
desired item. The algorithm, called factorized quantum 
search algorithm, presented by him can locate the desired 
item in an unsorted database using Oilog^N) queries to 
facorized oracles [l[ . Unfortunately, the algorithm would 
not work if it is used to search more than one desired 
items out of an unsorted database, or the property values 
among which it is searching for the desired item are not 
distinct. From the point of view of unsorted database 
manipulation, we consider the problem with more than 
one desired items in an unsorted database and with the 
property values that are not distinct from each other. 



And we present a quantum search engine, based on the 
factorized quantum search algorithm, that can solve the 
problems. Our goal is achieved by introducing auxiliary 
files for the property values that are not distinct, and 
converting the searching request into a sequence of calls 
to factorized quantum search algorithm. 

The paper is organized as follows: In section II, we re- 
view quantum random access memory. Structured Query 
Language, and factorized quantum search algorithm. In 
section III, we present our quantum search engine, dis- 
cuss how to digitize the property values of an unsorted 
database, which might be distinct from each other or not, 
and discuss how to convert a complex query request into 
a sequence of calls to the factorized quantum search al- 
gorithm. We also discuss how the factorized quantum 
search algorithm works on an actual quantum computer 
with the unsorted database existing in the quantum ran- 
dom access memory. In section IV, We give several ex- 
amples to show how the quantum search engine works. 
We conclude the paper in section V. 



II. BACKGROUNDS 

A. Quantum random access memory 

A quantum random access memory (qRAM) uses 
nqubits to address any quantum superposition of N mem- 
ory cells. It means the qRAM can perform memory ac- 
cesses in coherent quantum superposition [5j : if the quan- 
tum computer needs to access a superposition of memory 
cells, the address register A must contain a superposition 
of addresses ^ aj\j)A^ and the qRAM will return a su- 
perposition of data in a data register D, correlated with 
the address register: 



' ^e-mail: newpine2002@yahoo.com.cn| 



^aj\j)A — ^^aj\j)A\dj 



'D, 



(1) 
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where dj is the content of the jth memory cell p . Giovan- 
netti, Lloyd and Maccone have presented an architecture 
that exponentially reduces the requirements for a mem- 
ory call: 0{log2N) switches need be thrown instead of 
the N used in classical RAM designs, which yields a more 
robust qRAM algorithm than conventional (classical and 
quantum) designs. See Q for more details. 



B. Structured query language 

Structured Query Language (SQL) is a tool widely 
used in manipulating the classical databases [t']. Basic 
operations in SQL include inserting a new record to 
a database file (INSERT), updating an existing record 
(UPDATE), deleting an exiting record (DELETE), se- 
lecting (SELECT) and performing an arbitrary opera- 
tion on some records, backing up a portion of a database 
(BACKUP), and restoring the backup (RESTORE) [|. 
The most commonly used operation among them might 
be the SELECT operation, which is defined as: 

SELECT [ ALL | DISTINCT [ ON { exprcssion[, ...])]] 

* I expression [ AS output_name ] [, ...] 

[ INTO [ TEMPORARY | TEMP ] [ TABLE ] new.table ] 

[ FROM fromJtem [,...]] 

[WHERE condition ] 

[ GROUP BY expression [, ...] ] 

[ HAVING condition [,...]] 

[ UNION I INTERSECT | EXCEPT [ ALL ] select ] 

[ ORDER BY expression [ ASC | DESC 

USING operator ] [,...] ] 

[FOR UPDATE [ OF class jiame [,...] ] ] 

[ LIMIT count | ALL [ OFFSET | , start ]] 

FIG. Definition of the SELECT operation 

The SELECT statement is used to form queries for 
extracting information out of the database. The details 
about how to use it and the other basic operations can 
be found in We only show several examples, about 
the SELECT operation, useful for our discussion below. 
Assume that we have such a table as follows stored in the 
database, and the name of the table is Tablel. 



ID 


Name 


Age 


960112 


Lin 


18 


960114 


Yang 


20 


960113 


Xing 


19 


960115 


Yingyu 


20 



Table. 1 



Example 1: 

SELECT ID, Name, Age 
FROM Tablel 
WHERE Age=20 



of Age is 20. This operation will return with 



ID 


Name 


Age 


960114 


Yang 


20 


960115 


Yingyu 


20 



Table. 2: Result of example 1. 

Example 2: 

SELECT ID, Name, Age 
FROM Tablel 
WHERE IDO60115 

Similarly, this operation will return with 



ID 


Name 


Age 


960112 


Lin 


18 


960114 


Yang 


20 


960113 


Yang 


19 



Table. 3: Result of example 2. 

Example 3: 

SELECT ID, Name, Age 
FROM Tablel 

WHERE IDO60115 AND Age=20 
It will return with 



ID 


Name 


Age 


960114 


Yang 


20 



Table. 4: Result of example 3. 



C. Factorized quantum search algorithm 

In [H, Patel has observed that sorting can be thought 
of as factorization of the search process and the location 
of a desired item in a sorted database can be found by 
classical queries that inspect one letter of the label at 
a time. Consider a collection of items in a database, 
characterized by some property with values on the real 
line. Sorting these items means arranging them in an 
ordered sequence according to their property values. In 
order to efficiently implement the sorting algorithm we 
need to digitize the property values. It means replacing 
the property values by integer labels and writing them 
as a string of letters belonging to a finite alphabet. In 
digital computers, this finite alphabet has size 2, and the 
letters are called bits [l| . Assume that the database have 
N = 2" items (If N is not a power of 2, then the database 
is padded up with extra labels to make N = 2".) and 
without loss of generality the desired item has the label 



It means selecting the records whose property values 



X = X1X2 ■ ■ - Xn = 00 . . . 0. 



(2) 



3 



Then the search process is equivalent to finding x such 
that 

fi^)=YlM^i) = (l-xi)(l-a;2)...(l-a;„) (3) 



equals to one. In the sorted and digitized database, one 
searches for an item by inspecting only one Xi at a time, 
i.e., evaluating f(x) by sequentially combining its factors 
fi{xi) . The functions f{x) and fi{xi) are referred to 
as oracle and factorized oracle respectively. If only the 
oracle f{x) is available then the search process requires 
0{N/2) queries even with a sorted database. The collec- 
tion of factorized oracles fi{xi) is more powerful than the 
global oracle f{x) (One can construct f(x) by combining 
all the fi{xi) together.). And with the factorized oracles, 
the search process requires only 0{lnN) queries [ij. 

In the paper, Patel has also observed that if the 
database permits quantum queries , then mere digitiza- 
tion is sufficient for efficient search with the quantum 
factorized oracles. The algorithm called factorized quan- 
tum search algorithm introduced by him is described as 



\x) = \{{P,R,F,)\'iIj 

start 1 5 



(4) 



where |-0start) = I]J=i ^^1.?)' which is the uniform su- 
perposition of all the N states. The transformations Fi, 
Ri, and Pi act on a single letter of the label only. Fi 
will evaluate to -1 when the letter and its corresponding 
letter of the desired item match and to +1 when they do 
not. 



{±1}. 



(5) 



The reflection operator Ri can also written in a factorized 
form, 



R,=h 



{Ro}i 



In, 



(6) 



where 



Rn — - 





/- 


-1 


+1 


+1 


+i\ 


1 


^ 


-1 


-1 


+1 


+1 


2 


^ 


-1 


+1 


-1 


+1 






-1 


+1 


+1 


-1/ 



(7) 



The projection/measurement operator Pi removes from 
the Hilbert space all the states with zero amplitude. Note 
that the quantum database is digitized with the size 
of the alphabet equals to 4. The factorized quantum 
search algorithm locates the desired item in an unsorted 
database using 0{logiN) queries. 

The search process inspects each letter of the label in 
turn, and decide whether it matches the corresponding 
letter of the desired string or not. The algorithm use 
Grover's algorithm d nil to do this job. Since Grover's 
algorithm requires only one query to pick one item out of 
four with certainty, it is efficient to digitize the quantum 



database with the alphabet {|00), |01), |10), |11)}. Digiti- 
zation requires that the items must be distinct from each 
other, and before digitization, we might need to pad up 
the database. We will turn back to these two problems 
later. 



III. THE QUANTUM SEARCH ENGINE 

A. The function of the engine 

The database we are going to searching for desired 
items is unsorted. It is the only difference between the 
database and a general database. Our quantum search 
engine on such a database can be demonstrated by FIG.l, 
in which the Implementation module in charge of carry- 
ing out an actual quantum database search algorithm 
(i.e. Factorized quantum search algorithm here). The 
purpose of the analysis module is to resolve the query 
conditions passed into the engine by outside programs 
such as database application programs, and initiates a 
sequence of calls to the implementation module. 



-In- 



Quantum database search engine 



Analysis 



Implementation 




-Out- 



Database 



FIG. 1: The function map of quantum search engine 

We will return to the implementation module later, 
but in the rest of this subsection, we focus on the anal- 
ysis module. From the previous section we can see that 
mere digitalization is sufficient for the factorized quan- 
tum search algorithm. The digitalization requires that 
the property values arc distinct from each other. It means 
that we can not apply the factorized quantum search al- 
gorithm directly if there are more than one potentially 
desired items in the query conditions. For example, the 
query indicated by the example 1 in section II can not 
carried out directly by implementing the factorized quan- 
tum search algorithm, because query condition of it in- 
cludes 2 potential desired records. We have used the 
fact that if one implements a query on a database man- 
agement interface(Here, one query means a SQL state- 
ment with a query condition, which is different from 
the " query" used in query complexity theory) , the corre- 
sponding query request will be sent to the search engine. 

The problem that how to digitize a database becomes 
how many potential target items in one query. To answer 
this problem we need to make query request clearer. 

Definition 1. A complex query is a SQL statement with 
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a relational expression. 



B. Digitalization 



Definition 2. A simple query is a SQL statement with 
a logical expression. 

A query condition is equivalent to a relational ex- 
pression, which combines two or more logical expres- 
sions with relational operators such as AND, OR and 
NOT. A logical expression is an expression that has two 
operands connected with a logical operator from the set 
{>,^, <,<,=, 7^} i. 

Theorem 1. Any complex query can be split into a se- 
quence of simple queries. 

To prove this theorem we need three facts: 

Fact 1: Assume that a query condition has the form of 
"CONDITIONl AND C0NDITI0N2" . We can divided 
it into two queries using "CONDITIONl" and "CONDI- 
TI0N2" respectively. They return with 2 set of records. 
The intersection of these two sets gives the result of the 
primitive query. 

For example, the query demonstrated by example 3 
can be divided into two queries as indicated by example 
1 and example 2 respectively. 

Fact 2: Assume that a query condition has the form 
of "CONDITIONl OR C0NDITI0N2" . Similarly, the 
query can be separated into two queries, and the union of 
the set returned by them gives the result of the primitive 
query. 

Fact 3: Assume that a query condition has the form of 
"NOT CONDITION". The result of the primitive query 
is the complementary set of the resulting set of records 
of the query with query condition "CONDITION". 

With these three facts, Theorem 1 is obvious. Here, 
we only show how to split a complex query into sim- 
ple query without considering the problem of query op- 
timization. Then, how many potential target records in 
a simple query? It depends on both the content of the 
database we are manipulating and the query itself. In 
the next two subsections, we will consider how the effect 
of the content of the database and how a simple query 
can be converted into a sequence of calls to the factor- 
ized quantum search algorithm. But here, we return to 
summarize the function of the search engine. It can be 
described as follows: 

a. Accept query request. 

b. Call the analysis module to analyze the query con- 
dition. 

c. Split the primitive query into simply queries. 

d. Initiate a sequence of calls to the implementation 
module. 

e. Collect the results returned by the calls and do 
further operations. 

f. Return the final result. 

In short, the quantum search engine will convert every 
complex query into a sequence of simple queries each with 
one logical expression, and finally into a sequence of calls 
to the factorized quantum search algorithm. 



Digitization means writing all the property values as 
strings of letters belonging to a finite alphabet [l| . This 
requires that all the property values, among which the 
algorithm is going to searching for desired items, must be 
distinct from each other. Digitization on the key values 
is easy, because the key values are distinct. Consider the 
case that if the content of TABLE 1 is the content of the 
database we are going to query for target records. We 
can use a simple coding scheme for this case by replacing 
960112 - 960115 by ~ 3, and digitize them as 



100) 
101) 
|10) 
111). 



(8) 



Digitization also requires that a database file must have 
iV = 4" items. If its original file does not meet this re- 
quirement, we have to pad up the database file. And we 
need a tag to identify whether or not a database file has 
been padded up. If implementation of factorized quan- 
tum search algorithm gives a record with tag indicating 
that it is a padding item, then the result would be aban- 
doned. But for simplicity, we assume the requirement is 
satisfied (In fact, we can avoid this problem by allocating 
a database file with size L * 4", where L is the size of an 
item). 

The values of common property are generally not dis- 
tinct from each other in an actual database. A solution 
to this problem is that using one or more auxiliary files 
to record the values of common property if we permit a 
search on the property. If a common property has two 
or more auxiliary files (If the property values are distinct 
from each other, then it doesn't need any auxiliary file), 
we say that the database enable queries on the property. 
In other words, a search on the values of a common prop- 
erty without the help of auxiliary file will possibly fail. 
The auxiliary file has such a structure as 



Property values Addresses in main file 



Table. 5: Structure of an auxiliary file. 

A table in a database corresponds to a main file and 
two or more auxiliary files. If the database detects that a 
manipulation on it will result in non- uniqueness of one or 
more property values, it will record the property value in 
an auxiliary file. It needs one or more auxiliary files for a 
property. The number of the auxiliary files corresponding 
to a property rests with how many property values are 
the same. 

For example, to record the property values of "Age", 
it might need such two auxiliary files as 
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Property values 


Addresses in main file 


18 


1 


20 


2 


19 


3 


Table. 6, A 


Property values 


Addresses in main file 


20 


4 



Table. 6, B 
Table. 6: Auxiliary files for table 1. 



Here, we suppose the addresses of the four records in 
table 1 are 1, 2, 3, and 4. Note that all the property 
values in an auxiliary file should be distinct from each 
other. So we can treat the property values in a auxiliary 
file in the way as the key values when digitalizing. 



C. Initiation of calls to implementation module 

In this subsection, we focus on a simple query with only 
a logical expression as its query condition. We have said 
that a logical expression is an expression that has two 
operands connected with a logical operator from the set 
{>,^,<,<,=,f^}[8]. A logical expression in a simply 
query has such a form as " x o a" , where x is a property 
variable, o is a logical operator, and a is a given property 
value. Assume that the definition domain of x is [xi, X2]. 
The algorithm of initiating calls to implementation mod- 
ule can be described as the following subroutines each 
corresponding to a type of logical expression. 

f(x,=,a): 

a. If X is a key property, then call the implementation 
module with the query condition on the main file. 

b. Else if the search on the property x is not allow- 
able, i.e., its property values are not distinct from each 
other and no auxiliary file are available, then return with 
failure. 

c. For Every auxiliary file, call the implementation 
module with the query condition. 

d. Return with the union set of the results of all calls. 
f(x,^,a): 

a. Foreveryy in [a;i,X2] and unequal to a, call f(x,=,y). 

b. Return with the union set of the results of all calls. 
f(x,<,a): 

a. For Every y in [xi,a), call f(x,=,y). 

b. Return with the union set of the result of all calls. 

The subroutines for the rest types of logical expres- 
sion, i.e., f(x,^,a), f(x,>,a), and f(x,^,a), are similar to 
f(x,<,a). 



D. The implementation module 

We have known that the implementation module car- 
rys out the factorized quantum search algorithm. So, 
we turn to see how the factorized quantum search algo- 
rithm work on an actual quantum computer. Assume 
that our database exists in a qRAM, the content of the 
database is table 1, and we want to find the record that 
"ID=960112". The four property values of "ID" are en- 
coded as 

960112 — > |00) 

960113 — > 101) 

(9) 

960114 — > |10) ^ ' 

960115 — > 111). 

First, we need to prepare the content of the database 
in a uniform state, we can do this job by setting the 
address register A in the state ^(\a) + \b) + \c) + \d)), 
where a, b, c, d, are the corresponding addresses of the 
four records. And the qRAM will return a superposition 
of data in a data register D, correlated with the address 
register 

^(|00) (g) \a) + |01) ® \b) + |10) (g) |c) + 111) (g) \d)) (10) 

It is easy to see that it takes only one oracle query for 
the factorized quantum search algorithm to find the de- 
sired item, and the resulting state of the register would be 
1 00) ® |a). Measurement on the register gives the desired 
item with certainty. 

E. query complexity and space consumption 

If a property has t possible values, and ji denotes 
the number of items with the property value pi, where 
i ranges from 1 to t, then the number of the auxiliary 
files is M = ■max{ji, j2, ■ ■ ■ ,jt)- Let the size of the M 
auxiliary files he Ni, N2, ■ ■ ■ , Nm, then the total memory 
space for all the M auxiliary files is A^i -I- 7V2 + • ■ ■ + Nm , 
which is 4N in the worst case (Note that we need to pad 
every auxiliary file such that Ni is the power of 4). 

Theorem 2. The query complexity of Our algorithm for 
the quantum search engine responding to one complex 
query request is at most 0{P * Q * M * log^N), where 
P is the number of the potential simple query requests 
in the complex query request, Q is the maximum number 
of calls to the factorized quantum search algorithm, M is 
the number of the auxiliary files for the property on which 
our algorithm are searching for desired items. 

Proof: Because a complex query is converted into P*Q 
calls to the factorized quantum search algorithm on every 
auxiliary files, the total number of calls to the factorized 
quantum search algorithm is P * Q * M. And the query 
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complexity of the factorized quantum search algorithm is 
Oilog^N) py]. So, the total query complexity correspond- 
ing to a complex query is at most 0{P * Q * M * log^N). 

If the values of the property, on which our algorithm 
are implementing, are distinct from each other, then the 
query complexity are 0{P * Q * logiN). However, the 
query complexity of a typical search engine using the 
best classical search algorithm mentioned above is 0{P* 
Q * log2N) for a complex query. 

IV. EXAMPLES 

We take the example 3 in section II as an example 
to show how our quantum search engine works. After 
the engine accept the query request, it will first call the 
analysis module to analyze the query condition, and split 
the primitive query into simply queries demonstrated by 
example 1 and 2. Let us see how the engine deals with the 
example 1 and 2. Assume that "ID" is the key property 
and the database enable queries on the property " Age" . 

Search process of example 1: 

a. The engine calls the subroutine f(Age,=,20). 

b. Because "Age" is not the key property, then for 
every auxiliary files(The auxiliary files can be seen in 
table 6.), call the implementation module with the query 
condition. 

c. For the first auxiliary file, the implementation mod- 
ule returns with 



ID 


Name 


Age 


960114 


Yang 


20 



For the second auxiliary file, the implementation mod- 
ule returns with 



ID 


Name 


Age 


960115 


Yingyu 


20 



d. The result of example 1, which can seen in table 2, 
is the union of these two results. 
Search process of example 2: 

a. The engine calls the subroutine f(ID,<, 960115). 

b. The subroutine then caU f(ID, =,960112), 
f(ID,=,960113), and f(ID,=,960114). Note that ID is a 
key property, then f(ID,=,960112) and f(ID,=,960113) 
call the implementation module with the corresponding 
query condition respectively on the main file. 

c. f(ID,=,960112) returns with 



ID 


Name 


Age 


960112 


Lin 


18 



f(A,=,960113) returns with 



ID 


Name 


Age 


960113 


Lin 


19 



f(A,=,960114) returns with 



ID 


Name 


Age 


960114 


Lin 


20 



The union of the results of f(ID, =,960112), 
f(A,=,960113), and f(A,=,960114) gives the results 
of example 2, which can be seen in table 3. And the 
intersection of the results of example 1 and 2 is the 
result of example 3, which can be seen in table 4. 



V. CONCLUSION 

We have considered the problem of finding one or more 
desired items out of an unsorted database. The factorized 
quantum search algorithm presented by Patel can locate 
one desired item in an unsorted database using 0(log4N) 
queries to factorized oracles. But the algorithm can only 
solve the problem finding the desired item out of an un- 
sorted database with only one target item. We extend the 
algorithm to solve the problem with more than one tar- 
get items in an unsorted database. The goal is achieved 
by introducing auxiliary file, and converting a complex 
query request into a sequence of simple queries, and fi- 
nally into a sequence of calls to the factorized quantum 
search algorithm. 

The query complexity of our algorithm is 0{P * Q * 
M * log^N), where P is the number of the potential sim- 
ple query requests in the complex query request, Q is 
the maximum number of calls to the factorized quantum 
search algorithm of the simple queries, M is the number 
of the auxiliary files for the property on which our algo- 
rithm are searching for desired items. We know that the 
query complexity of the best classical algorithm to select 
P*Q items from a sorted database is P*Q*log2N . When 
compared with the classical algorithm on either sorted or 
unsorted database, if M is small, the quantum search en- 
gine on unsorted database presented in the paper is more 
efficient. 

The most important thing implied in this paper is that 
to manage an unsorted database on a quantum computer 
is possible and efficient. Though a query algorithm on a 
sorted database can be more efficient than a query algo- 
rithm on an unsorted database, the sorting price might 
be very high. So the quantum search algorithm presented 
in this paper offers another way to database management 
on an actual quantum computer. 
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